Human Preference Score: Better Aligning Text-to-Image Models with Human Preference

Wu, Xiaoshi; Sun, Keqiang; Zhu, Feng; Zhao, Rui; Li, Hongsheng

arXiv:2303.14420 (cs)

[Submitted on 25 Mar 2023 (v1), last revised 22 Aug 2023 (this version, v2)]

Abstract: Recent years have witnessed rapid growth in deep generative models, with text-to-image models gaining significant attention from the public. However, existing models often generate images that do not align well with human preferences, for example images with awkward combinations of limbs and facial expressions. To address this issue, we collect a dataset of human choices on generated images from the Stable Foundation Discord channel. Our experiments demonstrate that current evaluation metrics for generative models do not correlate well with human choices. Thus, we train a human preference classifier on the collected dataset and derive a Human Preference Score (HPS) from the classifier. Using HPS, we propose a simple yet effective method to adapt Stable Diffusion to better align with human preferences. Our experiments show that HPS outperforms CLIP in predicting human choices and generalizes well to images generated by other models. By tuning Stable Diffusion with the guidance of HPS, the adapted model generates images that human users prefer more often. The project page is available here: this https URL.
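As a rough illustration of what "predicting human choices" can mean for a score of this kind, the sketch below ranks the candidate images for one prompt by cosine similarity to the prompt embedding, turns the similarities into probabilities with a softmax, and measures how often the top-ranked image matches the human pick. This is not the authors' implementation: the function names, the toy 2-D embeddings, the human choices, and the temperature value are all assumptions made for the example, and a real system would obtain embeddings from a learned image-text encoder.

```python
import math
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def preference_probs(
    text_emb: Sequence[float],
    image_embs: Sequence[Sequence[float]],
    temperature: float = 0.07,  # illustrative value, an assumption
) -> List[float]:
    """Softmax over prompt-image similarities for a single prompt."""
    logits = [cosine(text_emb, emb) / temperature for emb in image_embs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


def choice_accuracy(
    examples: Sequence[Tuple[Sequence[float], Sequence[Sequence[float]], int]],
) -> float:
    """Fraction of prompts where the top-scored image is the human's choice."""
    hits = 0
    for text_emb, image_embs, chosen in examples:
        probs = preference_probs(text_emb, image_embs)
        hits += int(probs.index(max(probs)) == chosen)
    return hits / len(examples)


# Toy data (hypothetical): (prompt embedding, candidate image embeddings,
# index of the image the human chose).
examples = [
    ([1.0, 0.0], [[0.9, 0.1], [0.1, 0.9]], 0),
    ([0.0, 1.0], [[0.8, 0.2], [0.2, 0.8]], 1),
    ([1.0, 1.0], [[1.0, 0.0], [0.7, 0.7]], 0),  # scorer disagrees with the human here
]

accuracy = choice_accuracy(examples)
print(f"agreement with human choices: {accuracy:.3f}")
```

Comparing two scoring functions on the same held-out choices with a measure like `choice_accuracy` is one simple way to check which of them tracks human preference more closely.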

Comments:	Accepted by ICCV 2023
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2303.14420 [cs.CV]
	(or arXiv:2303.14420v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2303.14420

Submission history

From: Xiaoshi Wu
[v1] Sat, 25 Mar 2023 10:09:03 UTC (12,155 KB)
[v2] Tue, 22 Aug 2023 12:26:07 UTC (14,369 KB)
